NSF PAR Search | NSF Public Access Repository

Correlated Errors in Large Language Models

Kim, Elliot Myunghoon; Garg, Avi; Peng, Kenny; Garg, Nikhil (June 2025, International Conference on Machine Learning)

Diversity in training data, architecture, and providers is assumed to mitigate homogeneity in LLMs. However, we lack empirical evidence on whether different LLMs differ meaningfully. We conduct a large-scale empirical evaluation of over 350 LLMs, using two popular leaderboards and a resume-screening task. We find substantial correlation in model errors: on one leaderboard dataset, models agree 60% of the time when both models err. We identify factors driving model correlation, including shared architectures and providers. Crucially, however, larger and more accurate models have highly correlated errors, even with distinct architectures and providers. Finally, we show the effects of correlation in two downstream tasks: LLM-as-judge evaluation and hiring, the latter reflecting theoretical predictions regarding algorithmic monoculture.
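
The "agree when both models err" statistic quoted in the abstract can be illustrated with a minimal sketch. It assumes multiple-choice predictions stored as plain lists; the function name and the toy data are hypothetical, and the paper's exact definition may differ in detail.

```python
from typing import Optional, Sequence


def agreement_when_both_err(
    preds_a: Sequence[str],
    preds_b: Sequence[str],
    labels: Sequence[str],
) -> Optional[float]:
    """Among items that both models answer incorrectly, return the
    fraction on which they give the same (wrong) answer.

    Returns None when the two models share no errors.
    """
    # Keep only the items where both predictions differ from the label.
    both_wrong = [
        (a, b)
        for a, b, y in zip(preds_a, preds_b, labels)
        if a != y and b != y
    ]
    if not both_wrong:
        return None
    return sum(a == b for a, b in both_wrong) / len(both_wrong)


# Hypothetical toy data: five multiple-choice questions.
labels = ["A", "B", "C", "D", "A"]
model_a = ["B", "B", "D", "A", "A"]
model_b = ["B", "C", "D", "B", "A"]

# Both models err on questions 0, 2 and 3; they pick the same wrong
# option on questions 0 and 2, so the rate is 2/3.
rate = agreement_when_both_err(model_a, model_b, labels)
```

With more than two answer options, two models that err independently would agree less often than models whose errors are correlated, which is why this conditional rate is informative.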
Free, publicly-accessible full text available June 18, 2026
Sparse Autoencoders for Hypothesis Generation

Movva, Rajiv; Peng, Kenny; Garg, Nikhil; Kleinberg, Jon; Pierson, Emma (June 2025, International Conference on Machine Learning)

We describe HypotheSAEs, a general method to hypothesize interpretable relationships between text data (e.g., headlines) and a target variable (e.g., clicks). HypotheSAEs has three steps: (1) train a sparse autoencoder on text embeddings to produce interpretable features describing the data distribution, (2) select features that predict the target variable, and (3) generate a natural language interpretation of each feature (e.g., mentions being surprised or shocked) using an LLM. Each interpretation serves as a hypothesis about what predicts the target variable. Compared to baselines, our method better identifies reference hypotheses on synthetic datasets (at least +0.06 in F1) and produces more predictive hypotheses on real datasets (~twice as many significant findings), despite requiring 1-2 orders of magnitude less compute than recent LLM-based methods. HypotheSAEs also produces novel discoveries on two well-studied tasks: explaining partisan differences in Congressional speeches and identifying drivers of engagement with online headlines.
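
Step (2) of the method, selecting features that predict the target, can be sketched in a simplified form. The sketch below ranks features by absolute Pearson correlation with the target, which is an assumption made for illustration and not necessarily the selection rule used by HypotheSAEs. The activation matrix, the click counts, and the function names are all hypothetical.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(
    activations: Sequence[Sequence[float]],
    target: Sequence[float],
    k: int,
) -> List[int]:
    """Return indices of the k features most correlated (in absolute value) with the target.

    `activations` holds one row per document and one column per feature.
    """
    n_features = len(activations[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in activations]
        scores.append((abs(pearson(column, target)), j))
    # Sort by descending score; break ties by feature index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scores[:k]]


# Hypothetical data: six headlines, three binary feature activations, click counts.
activations = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
]
clicks = [10, 2, 9, 1, 11, 3]

top_features = select_features(activations, clicks, k=2)
```

In the full pipeline described above, the selected feature indices would then be passed to an LLM for interpretation in step (3).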
Free, publicly-accessible full text available June 18, 2026
Sparse Autoencoders for Hypothesis Generation

Movva, Rajiv; Peng, Kenny; Garg, Nikhil; Kleinberg, Jon; Pierson, Emma (May 2025, ICML)

Free, publicly-accessible full text available May 30, 2026
A No Free Lunch Theorem for Human-AI Collaboration

https://doi.org/10.1609/aaai.v39i13.33574

Peng, Kenny; Garg, Nikhil; Kleinberg, Jon (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

The gold standard in human-AI collaboration is complementarity: when combined performance exceeds both the human and algorithm alone. We investigate this challenge in binary classification settings where the goal is to maximize 0-1 accuracy. Given two or more agents who can make calibrated probabilistic predictions, we show a No Free Lunch-style result. Any deterministic collaboration strategy (a function mapping calibrated probabilities into binary classifications) that does not essentially always defer to the same agent will sometimes perform worse than the least accurate agent. In other words, complementarity cannot be achieved for free. The result does suggest one model of collaboration with guarantees, where one agent identifies obvious errors of the other agent. We also use the result to understand the necessary conditions enabling the success of other collaboration techniques, providing guidance to human-AI collaboration.
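
A small numerical illustration of the No Free Lunch claim follows. It uses a toy joint distribution constructed for this sketch, not taken from the paper, in which both agents are calibrated, and it checks one example of a deterministic strategy ("follow whichever agent is more confident"). All names and counts are hypothetical.

```python
# Each row: (agent A's probability, agent B's probability,
#            number of items, number of those items with true label 1).
CELLS = [
    (0.1, 0.2, 90, 0),
    (0.1, 0.8, 10, 10),
    (0.7, 0.2, 45, 27),
    (0.7, 0.8, 65, 50),
]


def is_calibrated(agent_index: int, tol: float = 1e-9) -> bool:
    """Check that, for each probability p the agent reports, the share of label-1 items is p."""
    totals, positives = {}, {}
    for cell in CELLS:
        p = cell[agent_index]
        totals[p] = totals.get(p, 0) + cell[2]
        positives[p] = positives.get(p, 0) + cell[3]
    return all(abs(positives[p] / totals[p] - p) < tol for p in totals)


def correct_count(decide) -> int:
    """Number of items classified correctly by decide(p_a, p_b) -> 0 or 1."""
    correct = 0
    for p_a, p_b, n, n_pos in CELLS:
        correct += n_pos if decide(p_a, p_b) == 1 else n - n_pos
    return correct


def follow_a(p_a: float, p_b: float) -> int:
    return int(p_a > 0.5)


def follow_b(p_a: float, p_b: float) -> int:
    return int(p_b > 0.5)


def follow_more_confident(p_a: float, p_b: float) -> int:
    """Defer to the agent whose probability is farther from 0.5."""
    chosen = p_a if abs(p_a - 0.5) > abs(p_b - 0.5) else p_b
    return int(chosen > 0.5)


total = sum(n for _, _, n, _ in CELLS)                 # 210 items
correct_a = correct_count(follow_a)                    # 167 correct
correct_b = correct_count(follow_b)                    # 168 correct
correct_combined = correct_count(follow_more_confident)  # 158 correct
```

On this distribution the combined strategy classifies 158 of 210 items correctly, fewer than either agent alone (167 and 168), even though each agent is individually calibrated. The example shows only that such a distribution exists for this one strategy; the general statement is the paper's theorem.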
Free, publicly-accessible full text available April 11, 2026
Mitigating dataset harms requires stewardship: Lessons from 1000 papers

Peng, Kenny; Mathur, Arunesh; Narayanan, Arvind (December 2021, Advances in neural information processing systems)

Machine learning datasets have elicited concerns about privacy, bias, and unethical applications, leading to the retraction of prominent datasets such as DukeMTMC, MS-Celeb-1M, and Tiny Images. In response, the machine learning community has called for higher ethical standards in dataset creation. To help inform these efforts, we studied three influential but ethically problematic face and person recognition datasets, namely Labeled Faces in the Wild (LFW), MS-Celeb-1M, and DukeMTMC, by analyzing nearly 1000 papers that cite them. We found that the creation of derivative datasets and models, broader technological and social change, the lack of clarity of licenses, and dataset management practices can introduce a wide range of ethical concerns. We conclude by suggesting a distributed approach to harm mitigation that considers the entire life cycle of a dataset.
Full Text Available
